Performance Analysis of Centralized versus Distributed Recovery Schemes in P2P Storage Systems
Authors
Abstract
This report studies the performance of Peer-to-Peer Storage Systems (P2PSS) in terms of data lifetime and availability. Two schemes for recovering lost data are modeled through absorbing Markov chains, and their performance is evaluated and compared. The first scheme relies on a centralized controller that can recover multiple losses at once, whereas the second scheme is distributed and recovers one loss at a time. The impact of each system parameter on the performance is evaluated, and guidelines are derived on how to engineer the system and tune its key parameters in order to provide the desired lifetime and/or availability of data. We find that, in stable environments such as local area or research laboratory networks where machines are usually highly available, the distributed-repair scheme offers a reliable, scalable and cheap storage/backup solution. This is in contrast with the case of highly dynamic environments, where the distributed-repair scheme is inefficient as long as the storage overhead is kept reasonable. P2PSS with a centralized-repair scheme are efficient in any environment but have the disadvantage of relying on a centralized authority. Our analysis also suggests that the use of large fragments reduces the efficiency of the recovery mechanism.
Key-words: peer-to-peer storage systems, recovery process, absorbing continuous-time Markov chain, performance evaluation
inria-00346503, version 1 - 11 Dec 2008
French title: Analyse de performance de mécanismes centralisé et distribué de récupération de données dans des systèmes pair-à-pair dédiés au stockage (Performance analysis of centralized and distributed data recovery mechanisms in peer-to-peer storage systems)
Summary: This report studies the performance of peer-to-peer systems dedicated to storage (archiving or backup) in terms of data lifetime and availability. Two mechanisms for recovering lost data are modeled by an absorbing Markov chain, and their performance is evaluated and compared. The first mechanism requires a server and can therefore recover several lost data items at once, whereas the second is distributed but recovers only one lost item at a time. We evaluate the impact of each parameter on system performance and show how our results can be used to guarantee that the required quality of service is delivered. We find that in stable environments, such as local area networks or research networks, where machines are generally highly available, distributed repair offers a reliable, efficient and inexpensive backup solution. However, when peer dynamics are pronounced, distributed repair becomes inefficient, especially at reasonable redundancy levels. The centralized repair mechanism proved efficient in any environment; its drawback is its reliance on a centralized server, with all that this implies in terms of management and vulnerability to attacks and failures. Our study also suggests that coarse-grained fragmentation of the data would reduce the efficiency of the recovery mechanism.
Keywords: peer-to-peer storage systems, data repair, absorbing continuous-time Markov chain, performance evaluation
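The absorbing-Markov-chain idea in the abstract can be illustrated with a small toy model. The sketch below is not the report's model: it assumes a block stored as s + r fragments, each lost independently at a hypothetical rate mu, with data lost once more than r fragments are missing. Repair fires at a hypothetical rate lam and restores either every missing fragment (centralized) or a single one (distributed). The function name and all parameters are illustrative assumptions. The expected data lifetime is the mean time to absorption, obtained by solving the linear system (-Q_T) t = 1 over the transient states.

```python
def expected_lifetime(s, r, mu, lam, centralized):
    """Mean time until more than r fragments are missing, starting from none missing.

    Toy model (assumption, not the report's chain): state j = number of
    missing fragments, j = 0..r transient, j = r + 1 absorbing (data lost).
    """
    n = r + 1
    # A = -Q restricted to the transient states; we solve A t = 1.
    A = [[0.0] * n for _ in range(n)]
    for j in range(n):
        loss = (s + r - j) * mu          # any remaining fragment may be lost
        repair = lam if j > 0 else 0.0   # nothing to repair when j == 0
        A[j][j] = loss + repair
        if j + 1 < n:
            A[j][j + 1] -= loss          # j -> j + 1 (absorbing when j == r)
        if j > 0:
            target = 0 if centralized else j - 1
            A[j][target] -= repair
    b = [1.0] * n

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(A[row][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = A[row][col] / A[col][col]
            for k in range(col, n):
                A[row][k] -= factor * A[col][k]
            b[row] -= factor * b[col]

    # Back substitution.
    t = [0.0] * n
    for row in range(n - 1, -1, -1):
        acc = b[row] - sum(A[row][k] * t[k] for k in range(row + 1, n))
        t[row] = acc / A[row][row]
    return t[0]


if __name__ == "__main__":
    for mode in (True, False):
        label = "centralized" if mode else "distributed"
        print(label, expected_lifetime(s=4, r=4, mu=0.1, lam=1.0, centralized=mode))
```

In this toy setting, the centralized variant never yields a shorter expected lifetime than the distributed one at the same repair rate, because each repair returns the chain to the state with no missing fragments rather than moving it back by one.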
Similar resources
Lifetime and availability of data stored on a P2P system: Evaluation of redundancy and recovery schemes
This paper studies the performance of Peer-to-Peer storage and backup systems (P2PSS). These systems are based on three pillars: data fragmentation and dissemination among the peers, redundancy mechanisms to cope with peer churn, and repair mechanisms to recover lost or temporarily unavailable data. Usually, redundancy is achieved either by using replication or by using erasure codes. A new cla...
Performance Analysis of Peer-to-Peer Storage Systems
This paper evaluates the performance of two schemes for recovering lost data in a peer-to-peer (P2P) storage system. The first scheme is centralized and relies on a server that recovers multiple losses at once, whereas the second one is distributed. By representing the state of each scheme by an absorbing Markov chain, we are able to compute their performance in terms of the delivered data lif...
Failure Detection in P2P-Grid System
Peer-to-peer (P2P)–Grid systems are being investigated as a platform for converging the Grid and P2P network in the construction of large-scale distributed applications. The highly dynamic nature of P2P–Grid systems greatly affects the execution of the distributed program. Uncertainty caused by arbitrary node failure and departure significantly affects the availability of computing resources an...
P2P Network Trust Management Survey
Peer-to-peer (P2P) applications are no longer limited to home users and are starting to be accepted in academic and corporate environments. While file sharing and instant messaging applications are the most traditional examples, they are no longer the only ones benefiting from the potential advantages of P2P networks. For example, network file storage, data transmission, distributed computing, and co...
Heterogeneous Search in Unstructured Peer-to-Peer Networks
Resource search or discovery is a fundamental issue in peer-to-peer (P2P) and grid studies. Search objects, or resources, can be cycles, storage spaces, files, services, addresses, and so on. In general, systems employ three categories of P2P-network architectures to improve search performance: centralized (such as Napster, http://www.napster.com/), decentralized but structured (such as...